Oligonucleotide library encoding randomised peptides

ABSTRACT

The invention relates to a method of producing an oligonucleotide library comprising a plurality of oligonucleotides, each oligonucleotide in the library having, at at least one predetermined position, a randomisation codon selected from a defined group of codons, the codons within said defined group coding for different amino acids. Vectors and host cells containing such libraries, and kits for the production of such libraries, are also provided.

The invention relates to the production of oligonucleotide libraries encoding randomised peptides, vectors and host cells containing such libraries, and kits for the production of such libraries.

Combinatorial techniques for the production of peptides have been available since the 1960's and have included solid-phase synthesis of peptides and additionally the parallel solid-phase synthesis methods developed in the 1980's. Such techniques are reviewed in the article by Pinilla C., et al. (Nature Medicine (2003), Vol. 9, pages 118-122).

The production of DNA libraries coding for different peptides is itself well-known in the art. Randomised gene libraries have little in common with conventional genomic or cDNA libraries. Conventional libraries consist of clones that collectively cover an entire genome/transcriptome and are generally screened by nucleic acid hybridisation. In contrast, randomised libraries generally contain variations of a gene or a fragment of a gene, which is screened for novel activity. The randomised genes are expressed and screened conventionally, for example, in phage, bacterial or in vitro display techniques.

Conventionally, standard gene randomisation techniques obligate the cloning of an excess of genes. Using either NNN or NNG/T codons, where N = A, C, G or T, 64 or 32 codons, respectively, must be cloned to ensure representation of all 20 amino acids. At first, these numbers may appear relatively small, but if we consider multiple positions of randomisation there is an exponential relationship between the number of genes cloned and the number of proteins obtained. Hine et al have previously described an alternative method for producing DNA libraries, ‘MAX’ randomisation, in which two or more positions are randomised such that all 20 amino acids, or a subset thereof, are represented (PCT publication WO 00/15777). This methodology requires a pool of oligonucleotides, for each position to be randomised, which are hybridised to a template that is conventionally (NNN) randomised at the appropriate positions. This approach enables each amino acid to be encoded only once within the pools of oligonucleotides, hence the number of unique genes generated is equivalent to the number of proteins encoded, regardless of the number of positions of randomisation. Although this methodology represents a significant improvement over traditional techniques, there remains a relatively high percentage of non-MAX (unwanted) codons (~10%) at the randomised positions. Additionally, due to the constraints of the hybridisation process, only small quantities of the DNA constructs are produced, which are difficult to manipulate, particularly when encoding subsets of amino acids.
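The exponential relationship described above can be illustrated with a short calculation. The following is a minimal sketch only; the function names are illustrative and not part of the specification, and the ratio ignores the stop codons present in NNN and NNG/T mixtures.

```python
# Codons that must be cloned per randomised position under each scheme
# (figures taken from the paragraph above).
CODONS_PER_POSITION = {"NNN": 64, "NNG/T": 32, "MAX": 20}
AMINO_ACIDS = 20


def library_sizes(positions: int) -> dict:
    """Number of distinct genes per scheme for `positions` randomised codons."""
    return {scheme: n ** positions for scheme, n in CODONS_PER_POSITION.items()}


def genes_per_peptide(scheme: str, positions: int) -> float:
    """Ratio of distinct genes to distinct peptides (20 amino acids per position)."""
    return CODONS_PER_POSITION[scheme] ** positions / AMINO_ACIDS ** positions


if __name__ == "__main__":
    for n in (1, 3, 7):
        print(n, library_sizes(n), round(genes_per_peptide("NNN", n), 1))
```

With one codon per amino acid the ratio stays at 1 however many positions are randomised, whereas for NNN and NNG/T it grows with every added position.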

Subsequently Hine et al have made improvements to this methodology which virtually eliminate the presence of unwanted sequences and increase the yield of DNA (Hine, A. V., Hughes, M. D., Nagel, D. A., Ashraf, M. and Santos, A. F. (2002). MAX Codon Gene Libraries. WO 03/106679; Hughes M. D., Nagel D. A., Santos A. F., Sutherland A. J. and Hine A. V. (2003). Removing the redundancy from randomised gene libraries. J. Mol. Biol. 331 (5), 973-979). This methodology employs some additional oligonucleotides which hybridise to the template strand but also provide an extension which is not complementary to the template strand. This enables the selective amplification of the required encoding strand, thus increasing the yield and minimising unwanted sequences.

The problem associated with prior art methods has been the production of more than two contiguous randomised codons. This is important to allow stretches of several amino acids in a sequence to be randomised. Variations of the methods described in, for example, WO 03/106679 only allow two contiguous codons to be randomised as they require the use of flanking oligonucleotide sequences to hybridise to the template strands. Furthermore, such methods require a randomised template strand to be produced prior to using selection oligonucleotides. Potentially this limits the number of codons that can be randomised because of the complexity/mass of the template oligonucleotide.

The inventors tried a number of techniques with which to randomise three or more consecutive codons, which proved to be difficult to carry out or alternatively produced unsatisfactory results. These included trying to ligate three or more MAX trinucleotides randomly into a single-strand oligonucleotide using RNA ligase. This latter technique proved to work very poorly, and subsequent PCR amplification led to smears in which individual species could not be isolated by the inventors. The addition of MAX codons, containing 3 nucleotides, ligated directly onto the blunt end of an oligonucleotide also proved to be difficult.

The technique now identified and described by the inventors unexpectedly allows the production of contiguous randomised codons without the need to produce randomised template oligonucleotides.

The invention provides a method of producing an oligonucleotide library comprising a plurality of oligonucleotides, each oligonucleotide in the library having, at at least one predetermined position, a randomisation codon selected from a defined group of codons, the codons within said defined group coding for different amino acids, said method comprising the steps of:

(a) Providing one or more double-stranded starter oligonucleotides;

(b) Providing a plurality of different double-stranded randomisation oligonucleotides comprising:

    (i) a coding strand, the coding strand comprising a randomisation codon; and
    (ii) a substantially complementary non-coding strand,

wherein each double-stranded randomisation oligonucleotide comprises a nucleotide sequence coding for a restriction endonuclease recognition site capable of being recognised by a restriction endonuclease, the restriction endonuclease capable of cleaving the randomisation oligonucleotide upstream or downstream of the endonuclease recognition site at a predetermined cleavage site to create a blunt ended cut;

(c) Ligating each double-stranded starter oligonucleotide to a double-stranded randomisation oligonucleotide to form ligated oligonucleotides;

(d) Amplifying the ligated oligonucleotides;

(e) Digesting the ligated oligonucleotides with the restriction endonuclease to form a plurality of randomised double-stranded oligonucleotides, each of which comprises, at one end, a randomisation codon (and its complementary sequence); and, optionally,

(f) Using the randomised double-stranded oligonucleotides as starter oligonucleotides and repeating method steps (a) to (e) and optionally step (f) to produce a plurality of randomised double-stranded oligonucleotides, each comprising an additional randomisation codon.
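Steps (c) to (f) can be pictured with a simple string model. The sketch below is illustrative only: it tracks coding-strand sequences as Python strings, treats the portion of the randomisation oligonucleotide beyond the codon as an opaque placeholder tail, and uses a hypothetical starter sequence and function names that do not come from the specification.

```python
from itertools import product


def add_codon_cycle(starters: set, codons: list, tail: str = "-TAIL") -> set:
    """One round of steps (c) to (e), modelled on coding-strand strings only."""
    # (c) ligation: each starter may be joined to any randomisation oligonucleotide
    ligated = {s + codon + tail for s, codon in product(starters, codons)}
    # (d) amplification changes copy number, not the set of species: no-op here
    # (e) digestion: the blunt cut next to the codon removes the tail
    return {seq[: -len(tail)] for seq in ligated}


def build_library(starter: str, codons: list, rounds: int) -> set:
    """Step (f): reuse each round's products as starters for the next round."""
    library = {starter}
    for _ in range(rounds):
        library = add_codon_cycle(library, codons)
    return library


if __name__ == "__main__":
    # Hypothetical starter and a three-codon subset, for illustration
    lib = build_library("ACGTACGT", ["GCG", "TGC", "GAT"], rounds=2)
    print(len(lib), sorted(lib)[:3])
```

Each round multiplies the number of distinct species by the number of randomisation codons supplied, so two rounds with three codons give nine contiguous two-codon combinations.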

The randomisation oligonucleotides differ by having different randomisation codons.

The oligonucleotides used are preferably DNA, however other double-stranded nucleotides, such as RNA, or analogues of DNA or RNA may be used.

Preferably, the double-stranded starter oligonucleotides comprise a blunt end onto which the double-stranded randomisation oligonucleotides are ligated.

The randomisation codon on the coding strand and the complementary codon on the non-coding strand preferably form a blunt end on the randomisation oligonucleotides and are preferably ligated to the blunt end of the double-stranded starter oligonucleotide.

At least a portion of the starter oligonucleotide to which the randomisation codon attaches may encode a part of a gene or other nucleotide sequence encoding a predetermined amino acid sequence.

Preferably, the randomisation oligonucleotides and the starter oligonucleotides are ligated by DNA ligase. DNA ligases are well-known in the art. For example, E. coli and phage T4 encode an enzyme, DNA ligase, which seals single-stranded nicks between adjacent oligonucleotides in a duplex DNA chain. The requirements of the different enzymes are well-known. For example, the T4 enzyme requires ATP whilst the E. coli enzyme requires NAD⁺. In each case, the cofactor is split and forms an enzyme-AMP complex. The complex binds either side of the DNA strands to be joined and makes a covalent bond between a 5′-phosphate on one strand and a 3′-OH group on the adjacent strand.

Hence, preferably at least one of the 5′ ends of the oligonucleotides to be ligated comprises a phosphate group to allow the oligonucleotides to be ligated by a DNA ligase.

Preferably, both of the 5′ ends of the oligonucleotides to be ligated comprise a phosphate group. Preferably the adjacent strand to the phosphate group(s) will contain a 3′-OH group.

Preferably, the predetermined cleavage site for the endonuclease is immediately adjacent to the randomisation codon.

The ligated oligonucleotides may be amplified by, for example, PCR using primers complementary to e.g. sequences on the starter oligonucleotides.

The PCR product produced may be purified and isolated, for example, using conventional techniques such as polyacrylamide gel electrophoresis (PAGE) and excision of the relevant band of DNA prior to isolation of the DNA and digestion with the restriction endonuclease.

Preferably step (f) is repeated one or more times to produce an oligonucleotide library comprising a plurality of oligonucleotides, each oligonucleotide in the library having at least two contiguous randomised codons. Preferably, the number of randomised codons contained in the library is 1 or 2, most preferably 3, 4, 5, 6, 7, 8, 9 or 10 codons.

Preferably the method provides ligating a double-stranded completion oligonucleotide to the starter oligonucleotide after the required number of randomised codons have been added. The completion oligonucleotide may comprise a predefined nucleotide sequence comprising a restriction endonuclease recognition site to allow the randomised nucleotide sequence to be spliced into a gene of interest. Alternatively, for example, the completion oligonucleotide may encode a non-randomised fragment of a gene of interest.

A nucleotide sequence attached to the randomisation codon of the randomisation oligonucleotide may be different in each round of adding the randomisation codons. That is, each set of randomisation oligonucleotides used in each round of steps (a) to (f), when the method is repeated to add further randomised codons, may contain a different sequence attached to the randomised codon. This allows randomised double-stranded oligonucleotides obtained after each round of random codon addition to be selectively amplified by PCR with a primer complementary to the different sequence. This is expected to reduce the need for PAGE purification between random codon addition cycles.

Preferably, the restriction endonuclease recognition site and cleavage site is:

(SEQ. ID. No. 1) 5′-GAGTCNNNNN^-3′
(SEQ. ID. No. 2) 3′-CTCAGNNNNN^-5′

where:

    N = any nucleotide
    ^ = the restriction endonuclease cleavage site.

Such a restriction endonuclease site has been identified as being recognised by two restriction endonucleases: MlyI, which is obtainable from New England Biolabs, Inc. and SchI, available from Fermentas Life Sciences. The two enzymes are produced by different micro-organisms. MlyI, for example, is described in U.S. Pat. No. 6,395,531 and is found in Micrococcus lylae. Both of the endonucleases recognise the double-stranded DNA sequence 5′-GAGTC-3′ and cleave DNA 5 bases downstream generating blunt ends. However, any restriction endonuclease which binds to a restriction endonuclease sequence but cuts adjacent to or a number of nucleotides upstream or downstream of the recognition sequence to allow separation of the randomisation codon may also be used.
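The cleavage rule stated above (recognise 5′-GAGTC-3′, cut 5 bases downstream, blunt) can be expressed as a small search over both strands. This is a sketch under that stated rule only; the function names are hypothetical, sequences are assumed to contain only A, C, G, T or N, and no claim is made about enzyme behaviour beyond what the text says.

```python
SITE = "GAGTC"   # recognition sequence given in the text
OFFSET = 5       # blunt cut 5 bases downstream of the site

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def _downstream_cuts(strand: str) -> list:
    """Cut indices on `strand` for every occurrence of SITE read 5'->3'."""
    cuts, start = [], strand.find(SITE)
    while start != -1:
        cuts.append(start + len(SITE) + OFFSET)
        start = strand.find(SITE, start + 1)
    return cuts


def blunt_cut_positions(top: str) -> list:
    """Top-strand indices i where the duplex is cut between top[i-1] and top[i]."""
    n = len(top)
    cuts = set(_downstream_cuts(top))
    # A site on the bottom strand is found in the reverse complement;
    # a cut at bottom index j corresponds to top index n - j.
    cuts.update(n - j for j in _downstream_cuts(revcomp(top)))
    return sorted(c for c in cuts if 0 < c < n)


if __name__ == "__main__":
    print(blunt_cut_positions("GAGTCAAAAATTT"))
    print(blunt_cut_positions("TTTCCCCCGACTCAA"))
```

The second call shows the case relevant to a codon placed at the 5′ end of the coding strand: the site lies on the non-coding strand, so the cut falls upstream of the reverse-complemented site on the coding strand.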

The genetic coding for different amino acids is well-known, for example the genetic code for codons in mRNA transcribed from the codon DNA strand is:

*Chain-terminating, or “nonsense” codons. **Also used to specify the initiator formyl-Met-tRNAMet. The Val triplet GUG is therefore “ambiguous” in that it codes both valine and methionine.

The different amino acids coded by different codons have different properties. In some circumstances it may be desirable to focus the library towards different groups of amino acids with similar properties. Hence, preferably the codons are selected. For example, they may be focused towards very hydrophobic amino acids (such as Val, Ile, Leu, Met, Phe, Trp or Cys), less hydrophobic amino acids (Ala, Tyr, His, Thr, Ser, Pro and Gly), part-hydrophobic amino acids (Arg and Lys), sulphur-containing amino acids (Cys), positively charged amino acids (Arg, His and Lys), negatively-charged amino acids (Asp and Glu), etc. Alternatively, they may be selected to avoid a particular amino acid, such as Pro or Cys.

Preferably at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 different randomisation codons are used in each round of randomisation codon addition.

Not all codons are efficiently used in any particular organism. Hence, preferably the codons are specifically selected to allow the randomised DNA library to be efficiently utilised in any host cell used for expressing the DNA library and producing peptides from it. The problems associated with not utilising randomised libraries efficiently are summarised in the article by Hughes M. D., et al. (J. Mol. Biol. (2003), Vol. 331, pages 973-979).

Preferably, the codons are selected for favoured codons for the expression of each amino acid in a selected organism. For example, for E. coli, the codons are preferably GCG (A), TGC (C), GAT (D), GAA (E), TTT (F), GGC (G), CAT (H), ATT (I), AAA (K), CTG (L), ATG (M), AAC (N), CCG (P), CAG (Q), CGC (R), AGC (S), ACC (T), GTG (V), TGG (W) and TAT (Y). The letters in brackets indicate the amino acid coded by the codon.
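The E. coli codon set listed above lends itself to a lookup table, from which focused subsets can be drawn. The following is an illustrative sketch: the dictionary is copied from the list above, while the helper names and the example peptide are assumptions made for the illustration.

```python
# Favoured E. coli codons, one per amino acid, as listed in the text.
MAX_CODONS_ECOLI = {
    "A": "GCG", "C": "TGC", "D": "GAT", "E": "GAA", "F": "TTT",
    "G": "GGC", "H": "CAT", "I": "ATT", "K": "AAA", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCG", "Q": "CAG", "R": "CGC",
    "S": "AGC", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAT",
}
CODON_TO_AA = {codon: aa for aa, codon in MAX_CODONS_ECOLI.items()}


def select_codons(include: str = "", exclude: str = "") -> list:
    """Codons for a focused subset; `include` empty means all 20 amino acids."""
    wanted = include or "".join(MAX_CODONS_ECOLI)
    return [MAX_CODONS_ECOLI[aa] for aa in wanted if aa not in exclude]


def encode(peptide: str) -> str:
    """Coding-strand DNA for a peptide using one codon per amino acid."""
    return "".join(MAX_CODONS_ECOLI[aa] for aa in peptide)


def decode(dna: str) -> str:
    """Translate DNA built only from the codons in the table above."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    print(select_codons(include="VILMFWC"))      # very hydrophobic group
    print(len(select_codons(exclude="PC")))      # avoiding Pro and Cys
    print(encode("MKV"), decode(encode("MKV")))
```

Because each amino acid maps to exactly one codon, encoding and decoding are inverse operations, which is the property that keeps the number of genes equal to the number of peptides.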

Preferably, the 3′ end of the coding strand comprises a blocking group to prevent nonproductive ligation of, for example, multiple copies of the randomisation oligonucleotides. The blocking group may be selected from an amino group, a phosphate group, a glyceryl moiety, a thiol group or, for example, a polyethylene glycol moiety.

Additionally, or alternatively, the non-coding strand may comprise, at its 5′ end, one or more nucleotides extending beyond the 3′ end of the complementary coding strand. This again reduces the number of nonproductive ligations occurring.

Preferably, the methods of the invention additionally comprise the steps of:

(g) Providing a randomised double-stranded starter oligonucleotide produced by a method according to any preceding claim;

(h) Providing a predefined oligonucleotide comprising:

    (i) a coding strand, the coding strand comprising a predefined codon coding for a predefined amino acid; and
    (ii) a substantially complementary non-coding strand,

wherein the predefined oligonucleotide comprises a nucleotide sequence coding for a restriction endonuclease recognition site capable of being recognised by a restriction endonuclease, the restriction endonuclease capable of cleaving the predefined oligonucleotide upstream or downstream of the endonuclease recognition site at a predetermined cleavage site to create a blunt ended cut;

(i) Ligating the randomised double-stranded starter oligonucleotide to the predefined oligonucleotide to form a ligated oligonucleotide;

(j) Amplifying the ligated oligonucleotide; and

(k) Digesting the ligated oligonucleotide with the restriction endonuclease to form a randomised oligonucleotide comprising at one end the predefined codon.

The predefined codon is preferably at one end of the predefined oligonucleotide.

This allows the insertion of a known amino acid into a predetermined position of the peptide produced by the oligonucleotide libraries. The advantage of this technique is that it allows the effect of any particular amino acid in that predefined library to be studied. Examples of such positional scanning libraries are discussed in, for example, the article by Pinilla (Nature Medicine (2003), Vol. 9, pages 118-122). The method of the invention allows the production of such positional scanning libraries to be achieved relatively rapidly and relatively easily.

One or more additional random codons may be added to an end of the predefined codon using the methods of the invention. The restriction endonuclease recognition site, restriction endonuclease, blocking group and/or optional extension groups to the 5′ end of the non-coding strand may be as defined above.

The randomised oligonucleotide comprising the predefined codon may then be used as a starter oligonucleotide to add one or more additional random codons.

Preferably, the coding strand of the randomisation oligonucleotide used in the method of the invention comprises the sequence:

(i) 5′ XXX^(N)_(a)R′(N)_(b)-B 3′ or
(ii) 5′-(N)_(b)R(N)_(a)^XXX-3′

where:

    XXX is the randomisation codon,
    N is any nucleotide,
    a is an integer of 0 to 10, preferably 5,
    b is an integer of 0 to 40, preferably 1 to 20, or preferably 1 to 10,
    ^ is the restriction endonuclease cleavage site,
    R′ is the reverse complement of the restriction endonuclease recognition site sequence,
    R is the restriction endonuclease recognition site sequence,
    B may or may not be present and may be selected from —OH, —NH₂, phosphate, a glyceryl moiety, a thiol and a polyethylene glycol moiety.

The second randomised oligonucleotide (ii) may be used to ligate to the 5′ end of the starter oligonucleotide.
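The two coding-strand layouts can be assembled programmatically to check where the cleavage site falls relative to the codon. This is a minimal sketch assuming the GAGTC recognition sequence with a = 5; the spacer, tail and starter sequences are arbitrary placeholders, the function names are hypothetical, and the blocking group B is not modelled.

```python
SITE = "GAGTC"      # R, recognition sequence
SITE_RC = "GACTC"   # R', its reverse complement


def _check(codon: str, spacer: str, tail: str) -> None:
    # Limits on a and b are those given in the definitions above
    assert len(codon) == 3 and 0 <= len(spacer) <= 10 and 0 <= len(tail) <= 40


def coding_strand_i(codon: str, spacer: str = "ATATA", tail: str = "CCTAGG") -> str:
    """Layout (i): 5' XXX ^ (N)a R' (N)b 3', codon at the 5' end."""
    _check(codon, spacer, tail)
    return codon + spacer + SITE_RC + tail


def coding_strand_ii(codon: str, spacer: str = "ATATA", tail: str = "CCTAGG") -> str:
    """Layout (ii): 5' (N)b R (N)a ^ XXX 3', codon at the 3' end."""
    _check(codon, spacer, tail)
    return tail + SITE + spacer + codon


def extend_3prime(starter: str, codon: str) -> str:
    """Ligate layout (i) downstream of the starter, then cut after the codon."""
    ligated = starter + coding_strand_i(codon)
    return ligated[: len(starter) + 3]


def extend_5prime(starter: str, codon: str) -> str:
    """Ligate layout (ii) upstream of the starter, then cut before the codon."""
    oligo = coding_strand_ii(codon)
    ligated = oligo + starter
    return ligated[len(oligo) - 3:]


if __name__ == "__main__":
    print(coding_strand_i("GCG"), coding_strand_ii("GCG"))
    print(extend_3prime("ACGTACGT", "GCG"), extend_5prime("ACGTACGT", "GCG"))
```

In both layouts the cut leaves exactly the three codon bases attached to the starter, at its 3′ end for layout (i) and at its 5′ end for layout (ii).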

Preferably, the oligonucleotides obtained by the methods of the invention are inserted into a suitable expression vector. Expression vectors, for example for expressing the peptides encoded by the oligonucleotides, are well-known in the art. For example, phage expression libraries in which oligonucleotides are inserted into the nucleotide sequence coding for phage coat proteins, so that the peptides encoded by them are expressed on the surface of phage particles, are well-known in the art. Additionally, bacterial surface expression vectors, yeast expression vectors and other eukaryotic expression vectors are well-known in the art.

Preferably the randomisation oligonucleotides used in each cycle of codon addition differ by having a different sequence, (N)_(b). That is, in a first cycle the randomisation oligonucleotides have a first sequence (N)_(b) with different randomisation codons. In a second, subsequent cycle the randomisation oligonucleotides have a second sequence (N)_(b)′ attached to randomisation codons. This allows randomised oligonucleotides obtained after each randomised codon addition cycle to be amplified with a primer specific for the sequence (N)_(b)′ or (N)_(b).

Preferably, the expression vector is inserted into a suitable expression host cell, such as a bacterial cell (e.g. E. coli), yeast, etc. The host cell expresses the peptide encoded by the DNA library which is then used for further study.

Methods for producing a randomised peptide library comprising expression of the oligonucleotide library obtained by a method according to the invention are also provided.

The protein libraries obtainable by the methods of the invention are also provided.

The invention also provides a randomised oligonucleotide library comprising a plurality of oligonucleotides, each oligonucleotide having 3 or more contiguous randomised MAX codons which represent the optimum codon usage of a predetermined organism, and wherein the MAX codons are different between different members of the library.

Preferably, the randomised library comprises 1 or 2, most preferably 3, 4, 5, 6, 7, 8, 9 or 10 MAX codons. Kits for producing an oligonucleotide library by a method according to the invention are also provided.

Preferably, the kit comprises a plurality of different randomisation oligonucleotides comprising:

    (i) a coding strand, the coding strand comprising a randomisation codon; and
    (ii) a substantially complementary non-coding strand,

wherein each double-stranded randomisation oligonucleotide comprises a nucleotide sequence coding for a restriction endonuclease recognition site capable of being recognised by a restriction endonuclease, the restriction endonuclease capable of cleaving the randomisation oligonucleotide upstream or downstream of the endonuclease recognition site at a predetermined cleavage site to create a blunt ended cut.

Preferably, the recognition site is:

(SEQ. ID. No. 1) 5′-GAGTCNNNNN^-3′
(SEQ. ID. No. 2) 3′-CTCAGNNNNN^-5′

where:

    N = any nucleotide
    ^ = the restriction endonuclease cleavage site.

The kit may additionally comprise a restriction enzyme capable of cleaving the randomisation oligonucleotide at the predetermined cleavage site. Preferably, the restriction endonuclease is selected from SchI and MlyI.

Preferably, the randomisation codons consist of MAX codons which represent the optimum codon usage of a predetermined organism of interest or a predetermined selection of said MAX codons.

Preferably, the coding strand of the double-stranded randomisation oligonucleotide comprises a 5′ end and a 3′ end, the 3′ end of the coding strand comprising a blocking group.

Preferably, the blocking group is selected from an amino group, a phosphate group, a glycerol moiety, a thiol group and a polyethylene glycol moiety.

Preferably, the non-coding strand comprises a 3′ end and a 5′ end, the 5′ end of the non-coding strand extending one or more nucleotides beyond the 3′ end of the complementary coding strand.

Preferably, the coding strand comprises the sequence:

(i) 5′ XXX^(N)_(a)R′(N)_(b)-B 3′ or
(ii) 5′-(N)_(b)R(N)_(a)^XXX-3′

where:

    XXX is the randomisation codon,
    N is any nucleotide,
    a is an integer of 0 to 10, preferably 5,
    b is an integer of 0 to 40, preferably 1 to 20, or preferably 1 to 10,
    ^ is the restriction endonuclease cleavage site,
    R′ is the reverse complement of the restriction endonuclease recognition site sequence,
    R is the restriction endonuclease recognition site sequence,
    B may or may not be present and when present may be selected from —OH, —NH₂, phosphate, a glyceryl moiety, a thiol and a polyethylene glycol moiety.

The kit may additionally comprise a predefined oligonucleotide comprising:

    (i) a coding strand, the coding strand comprising a predefined codon coding for a predefined amino acid; and
    (ii) a substantially complementary non-coding strand,

wherein the predefined oligonucleotide comprises a nucleotide sequence coding for a restriction endonuclease recognition site capable of being recognised by a restriction endonuclease, the restriction endonuclease capable of cleaving the predefined oligonucleotide upstream or downstream of the endonuclease recognition site at a predetermined cleavage site to create a blunt ended cut.

The invention also provides host cells comprising a DNA library obtainable by a method according to the invention and/or utilising a kit according to the invention.

The kits may be used to produce peptide libraries for screening the effect of different sequences on, for example, the binding of ligands or for the effect on a biological activity of the peptide.

The invention will now be described by way of example only with reference to the following figures.

FIG. 1 shows schematically a method of producing DNA sequences containing MAX codons at predetermined positions according to the present invention.

FIG. 2 shows the distribution of MAX codons and non-MAX codons at the predetermined positions within DNA sequences produced by the method of the present invention.

FIG. 3 shows a schematic diagram indicating an alternative method of the invention in which MAX codons are added at the opposite end of the starter oligonucleotide to that shown in FIG. 1.

EXAMPLE

FIG. 1 shows a schematic of the method used to generate the randomised DNA library containing MAX codons at the seven predetermined positions. MAX denotes a codon representing one of the 20 codons favoured in E. coli. NH₂ represents an amine group present at the 3′ end of this oligonucleotide in order to minimise non-productive ligations. The grey box represents the MlyI restriction endonuclease site.

The principal steps involved in the production of the library are:

    1. Hybridisation of the complementary pairs of oligonucleotides
    2. Ligation of the starter A and randomisation oligonucleotides B
    3. PCR amplification of the ligation product
    4. Digestion with MlyI and purification of product C by PAGE
    5. Repetition of the above process to achieve the desired number of contiguous randomised positions
    6. Cloning the DNA constructs into an appropriate vector

Hybridisation, ligation and cloning were performed as described below and the constructs transformed into E. coli DH5α (genotype: F.′80dlacZ(lacZYA-argF)U169 deoR recA1 endA1 hsdR17(rK−, mK+) phoA supE44−thi-1 gyrA96 relA1/F′ proAB+lacIqZM15 Tn10(tetr)) chemically competent cells, which were induced to take up DNA by heat shock. Clones were isolated and plasmid DNA recovered. The insert DNA was sequenced to establish the sequences present at the predetermined positions.

Any suitable host cell, and indeed alternative expression vectors, may be used instead of this strain of E. coli.

Materials and Methods

Starter DNA Synthesis

A fully complementary pair of starter DNA oligonucleotides was synthesised by MWG Biotech.

Randomisation Oligonucleotide Synthesis

Fully complementary pairs of randomisation oligonucleotides were synthesised by MWG Biotech. Randomisation oligonucleotides were designed to encode a single MAX codon at the 5′ end and an amino group at the 3′ end of the coding strand. The non-coding strand was not amino modified.

Phosphorylation

5′ phosphorylation reactions of the coding strands of the randomisation oligonucleotides were set up in a final volume of 50 μl unless otherwise specified. The reactions consisted of 1× Ligase buffer (NEB, New England Biolabs), 10 units T4 PNK, 300 pmoles of DNA and water to a final volume of 50 μl. The reaction was incubated at 37° C. for 30 min and then stopped by raising the temperature to 65° C. for 20 min.

Hybridisation

Hybridisation reactions were set up using equal amounts of two oligonucleotides. The final volume of the reaction was 50 μl. This reaction was heated to 95° C. and held at that temperature for 2 min. The temperature was allowed to decrease by 1° C./min until 4° C. was reached, to allow the complementary sequences to hybridise.

Ligations

Blunt end ligations were carried out between the starter and the pool of 20 randomisation oligonucleotides. Ligations were set up using equal quantities of forward and reverse oligonucleotides (50 pmoles), 1× Ligase buffer (NEB), 10 units ligase (NEB) and water to a final volume of 20 μl. The reaction was incubated at 26° C. overnight.

Polymerase Chain Reaction (PCR).

PCR reactions contained 1 unit of Pfu polymerase (Promega), 50 pmoles of each of the PCR primers, 1 μl of starter, 200 μM dNTP's, 1× Pfu buffer, and double distilled water to make the volume up to 100 μl. DNA was amplified using the following conditions: 94° C. for 30 seconds, 48° C. for 30 seconds, 72° C. for one minute for 35 cycles. The reactions were completed at 72° C. for 7 minutes and the samples stored at 4° C.
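The cycling conditions above can be written down as a small programme description, from which the programmed hold time and primer concentration follow. This is an illustrative sketch; the data layout and function names are assumptions, and ramp times between temperatures are not included.

```python
# Cycling conditions as stated in the text: (step name, temperature in C, seconds)
PCR_PROGRAMME = {
    "cycles": 35,
    "steps": [("denaturation", 94, 30), ("annealing", 48, 30), ("extension", 72, 60)],
    "final_extension": (72, 7 * 60),
    "hold_temperature_c": 4,
}


def programmed_hold_seconds(programme: dict) -> int:
    """Total programmed hold time, excluding ramping between temperatures."""
    per_cycle = sum(seconds for _name, _temp, seconds in programme["steps"])
    return programme["cycles"] * per_cycle + programme["final_extension"][1]


def primer_concentration_um(pmoles: float, volume_ul: float) -> float:
    """pmol per microlitre is numerically equal to micromolar."""
    return pmoles / volume_ul


if __name__ == "__main__":
    print(programmed_hold_seconds(PCR_PROGRAMME) / 60, "min")
    print(primer_concentration_um(50, 100), "uM per primer")
```

The 50 pmoles of each primer in 100 μl therefore corresponds to 0.5 μM, and the programme holds for 77 minutes before the 4° C. storage step.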

Phenol Chloroform Extraction of DNA.

A 0.1 volume of 3M sodium acetate (pH 5.2) was added to the DNA sample being purified, which was then mixed by vortexing. Subsequently one volume of phenol/chloroform/iso-amyl-alcohol (25:24:1) was added, and the sample vortexed and centrifuged for 2 min at 14000 rpm. The aqueous layer containing the DNA was removed carefully to a clean microfuge tube. One volume of chloroform was added and the resultant mixture vortexed and centrifuged for 2 min at 14000 rpm. The aqueous layer was removed to a clean microfuge tube, 2 volumes of ice cold ethanol were added and the sample vortexed. The microfuge tube was placed at −20° C. overnight or at −70° C. for 1 hour and then allowed to thaw prior to centrifugation for 20 minutes at 14000 rpm. The supernatant was removed and the DNA pellet was washed with 200 μl of 70% ethanol. The supernatant was removed and the pellet was allowed to air dry prior to resuspension in distilled water. Samples were stored at −20° C. until required.

Restriction Digests of DNA.

Restriction digests were set up in 1× appropriate buffer (according to manufacturers' instructions) with 0.1 μg/μl of DNA, 10 units of restriction enzyme and made up to 20 μl with double distilled water. The restriction digests were incubated at 37° C. or 55° C. for two hours as recommended by the manufacturer. The temperature was then raised to 65° C. for twenty minutes to denature the enzyme. Double digests were set up in the same way using the buffer most appropriate for the combination of enzymes.

Polyacrylamide Gel Electrophoresis.

Denaturing PAGE gels were prepared by dissolving 21 g urea, 12.5 ml of 40% acrylamide (19:1 acrylamide:bisacrylamide ratio), 5 ml TBE (10×) (0.9 M Tris base, 0.9 M Boric acid and 20 mM EDTA) and making to a final volume of 50 ml with water. This was stirred vigorously until the urea had dissolved, 60 μl of 10% ammonium persulphate was added and subsequently 80 μl of TEMED (N,N,N′,N′-tetramethylethylenediamine). The electrophoresis apparatus was set up as per manufacturers' instruction and the gel poured and set. The samples were then loaded onto the gel and a voltage of 20 V/cm applied for approximately 3 hours. The gel was then removed from the glass plates and placed into 1× TBE buffer with ethidium bromide at a concentration of 2 μg/ml. This was then placed on a shaker at 150 rpm and left for five minutes. The gel was then removed from the buffer and photographed under UV light. When a non-denaturing PAGE was required, the gel was made in the same way but the urea was omitted.
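The final concentrations implied by the gel recipe above can be checked with simple dilution arithmetic. The sketch below is illustrative; the only value not taken from the text is the molar mass of urea (60.06 g/mol), and the function name is hypothetical.

```python
UREA_MOLAR_MASS_G_PER_MOL = 60.06   # standard value, not stated in the text
FINAL_VOLUME_ML = 50.0


def diluted(stock: float, volume_ml: float, final_ml: float = FINAL_VOLUME_ML) -> float:
    """Concentration after diluting `volume_ml` of `stock` to `final_ml`."""
    return stock * volume_ml / final_ml


acrylamide_percent = diluted(40.0, 12.5)   # from 40% stock
tbe_strength = diluted(10.0, 5.0)          # from 10x stock
urea_molar = 21.0 / UREA_MOLAR_MASS_G_PER_MOL / (FINAL_VOLUME_ML / 1000.0)

if __name__ == "__main__":
    print(f"acrylamide {acrylamide_percent:.1f}%, TBE {tbe_strength:.1f}x, "
          f"urea {urea_molar:.2f} M")
```

On these figures the recipe corresponds to a 10% acrylamide gel in 1× TBE containing approximately 7 M urea.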

Elution of DNA from PAGE.

The band of interest was excised from the gel and placed in a DNA/protein elution column. 1× TAE buffer (100 mM Tris base, 19 mM acetic acid and 0.2 mM EDTA) was added to the column and the lid applied tightly to prevent any leakage. The column was then immersed into a gel electrophoresis tank filled with 1× TAE buffer. A voltage of 15 V/cm was applied for 20 min to allow the DNA to elute from the gel. After 20 min the current was reversed for 30 s to release the DNA from the membrane of the elution column into the buffer and the column removed from the rack. The buffer was carefully removed from the elution column and the DNA precipitated.

0.1 volume of 3M sodium acetate (pH 5.2) and 2 μl pellet paint were added to the extracted buffer. Two volumes of 100% ethanol were added and the sample was vortexed and incubated at room temperature for 1 min. The sample was then centrifuged at 14000 rpm for five min, the supernatant was removed, and the pellet was washed with 200 μl of 70% ethanol. The supernatant was removed and the pellet was dried prior to resuspension in an appropriate amount of double distilled water.

Preparation of Chemically Competent E. coli (DH5α)

A single colony of E. coli (DH5α) was inoculated into 10 ml of SOB andincubated overnight at 37° C. in a shaker at 250 rpm. 8 ml of theovernight culture was transferred into 800 ml of LB. This was incubatedat 37° C. in a shaker at 250 rpm until the culture was midway throughthe log phase (OD550 of ˜0.45). The cells were then chilled on ice for30 minutes and then pelleted by centrifugation at 4° C. The supernatantwas then removed and the cells resuspended by gentle pipetting in 264 mlRF1 (100 mM RbCl, 50 mM MnCl₂, 30 mM potassium acetate, 10 mM CaCl₂, 15%glycerol, adjusted to pH 5.8 with 0.2M acetic acid). The resuspendedcells were then incubated on ice for one hour. The cells were pelletedagain and the supernatant removed. The cells were resuspended in 64 mlRF2 (10 mM MOPS (4-morpholinepropanesulfonic acid), 10 mM RbCl, 75 mMCaCl₂, 15% glycerol, adjusted to pH 6.8 with NaOH) and incubated on icefor 15 minutes. They were dispensed in 200 μl aliquots in microfugetubes which were then flash frozen in liquid nitrogen and stored at −70°C. until required.

Transformation

The competent E. coli (DH5α) were thawed on ice and 100 μl was added to the ligation mix (20 μl), swirled to mix and allowed to incubate on ice for 30 minutes. The cells were heat shocked at 37° C. for 45 seconds and returned to ice for a further two minutes. 100 μl of 2× LB was added to the cells and this was allowed to incubate at 37° C. in a shaker at 250 rpm for 1 hour. 10 μl of IPTG and 50 μl of 2% X-gal were added to the cells prior to plating out on selective media.

Plasmid DNA Preparation

Plasmid DNA was prepared for sequencing using the Promega Wizard miniprep kit according to the manufacturer's instructions.

DNA Sequencing

Automated DNA sequencing was performed by the Birmingham University Genomics laboratory on an ABI 3700 sequencer.

Results

FIG. 2 shows the distribution of the different MAX codons obtained at the predetermined positions in the isolated clones. A total of 156 clones were sequenced, giving a total of 1092 MAX encoding positions. All 20 of the encoded sequences were represented within the clones analysed. The column labelled ‘non-MAX’ refers to codons which arose but were not specified in the randomisation mix. The column labelled ‘N’ refers to those codons in which the sequence could not be determined due to a lack of clarity in the sequencing. The column labelled ‘Del.’ refers to those codons which were either not present or contained a deletion.
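The categories used in FIG. 2 can be illustrated with a short tallying routine. The sketch below is an assumption-laden illustration, not the analysis actually used: the codon set `toy_max` and the clone reads in `toy_clones` are hypothetical placeholders (they are not the MAX codon set or data from the sequenced clones), a gap is assumed to be written as `-`, and an unreadable base is assumed to be written as `N`. The only figures taken from the text are the 156 clones and 1092 positions, which correspond to seven randomised positions per clone.

```python
from collections import Counter


def classify_codon(codon: str, max_codons: set) -> str:
    """Assign one sequenced codon to a FIG. 2 style category."""
    codon = codon.upper()
    if len(codon) != 3 or "-" in codon:
        return "Del."      # codon absent or containing a deletion
    if "N" in codon:
        return "N"         # base call could not be determined
    if codon in max_codons:
        return codon       # one of the specified codons
    return "non-MAX"       # readable codon that was not specified


def tally_positions(clones, max_codons) -> Counter:
    """Count categories over every randomised position of every clone."""
    counts = Counter()
    for clone in clones:
        for codon in clone:
            counts[classify_codon(codon, max_codons)] += 1
    return counts


# HYPOTHETICAL codon set and reads, for illustration only
toy_max = {"GCG", "TGC", "GAT"}
toy_clones = [["GCG", "TGC", "NNG"], ["GAT", "AAA", "G-T"]]
counts = tally_positions(toy_clones, toy_max)
print(dict(counts))

# Figures from the text: 1092 positions over 156 clones
positions_per_clone = 1092 // 156
print(positions_per_clone)
```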

The technique has at this stage been used to produce at least 7 contiguous randomised codons.
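To put seven randomised positions in context, the number of distinct genes that must be cloned can be compared for the codon schemes discussed in the background: 64 codons per position for NNN, 32 for NN(G/T), and 20 where each amino acid is encoded once. The following sketch simply evaluates the exponential relationship; the function and dictionary key names are illustrative.

```python
def library_sizes(positions: int) -> dict:
    """Number of distinct genes for a given number of randomised positions."""
    if positions < 1:
        raise ValueError("at least one randomised position is required")
    codons_per_position = {
        "NNN": 64,      # all possible codons
        "NN(G/T)": 32,  # third base restricted to G or T
        "MAX": 20,      # one codon per amino acid
    }
    return {scheme: n ** positions for scheme, n in codons_per_position.items()}


sizes = library_sizes(7)
for scheme, size in sizes.items():
    print(f"{scheme}: {size:,} genes")
```

With seven positions the one-codon-per-amino-acid scheme gives 1,280,000,000 genes, equal to the number of encoded peptides, whereas NN(G/T) and NNN require 34,359,738,368 and 4,398,046,511,104 genes respectively.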

In an alternative embodiment the MAX codon may be added at the opposite end of the starter oligonucleotide.

Conclusion

The technique provides a method of producing randomised oligonucleotides having randomised codons.

Alternative Method

FIG. 3 shows an alternative method of adding MAX codons. The MAX codons are added at the opposite end of the starter oligonucleotide to that shown in FIG. 1. The annotations are the same as for FIG. 1. This demonstrates that the technique may be used to introduce MAX codons at either end of a coding strand.

PAGE purification is shown in brackets because, in this system and indeed in the system shown in FIG. 1, the need for PAGE may be reduced by using different randomisation oligonucleotides with different sequences up- or down-stream of the MAX codon in each round of MAX codon addition. This allows the randomised double-stranded oligonucleotides obtained after each round of codon addition to be selectively amplified by PCR with a primer complementary to the different sequence.
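The repeated ligation and digestion cycle by which codons are added can be sketched in code. The following Python illustration makes several simplifying assumptions: only the coding strand is modelled, the amplification and purification steps are omitted, the recognition sequence GAGTC with a blunt cut five nucleotides downstream is taken from the MlyI/SchI site recited in the claims, and the starter sequence, flanking sequences and the three codons are hypothetical placeholders rather than the actual oligonucleotides or the MAX codon set.

```python
SITE = "GAGTC"     # recognition sequence (MlyI / SchI)
SITE_RC = "GACTC"  # reverse complement of the recognition sequence
CUT_OFFSET = 5     # blunt cut five nucleotides downstream of the site


def make_randomisation_oligo(codon, upstream="ATCG", spacer="AAAAA"):
    """Coding strand of the form 5'-(N)b R (N)a ^ XXX-3'.

    upstream and spacer are HYPOTHETICAL flanking sequences.
    """
    if len(codon) != 3 or len(spacer) != CUT_OFFSET:
        raise ValueError("codon must be 3 nt and spacer 5 nt long")
    return upstream + SITE + spacer + codon


def ligate(oligo, starter):
    """Join the randomisation oligo to the blunt end of the starter."""
    return oligo + starter


def digest(ligated):
    """Cut downstream of the site and keep the codon-bearing fragment."""
    if ligated.count(SITE) != 1 or SITE_RC in ligated:
        raise ValueError("expected exactly one recognition site")
    cut = ligated.index(SITE) + len(SITE) + CUT_OFFSET
    return ligated[cut:]


def add_codon_round(starters, codons):
    """One round: every starter receives every codon in the pool."""
    return [
        digest(ligate(make_randomisation_oligo(codon), starter))
        for starter in starters
        for codon in codons
    ]


# HYPOTHETICAL starter and codon pool, for illustration only
starter = "TTTGGGCCC"
codon_pool = ["GCG", "TGC", "GAT"]

round1 = add_codon_round([starter], codon_pool)
round2 = add_codon_round(round1, codon_pool)
print(round1)
print(len(round2))
```

Because the digestion removes the recognition site and leaves only the codon attached to the starter, the products of one round can be fed back in as starters for the next, and each round multiplies the number of distinct sequences by the size of the codon pool.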

1.-20. (canceled)
 21. A kit for producing an oligonucleotide library by a method, said method comprising a plurality of oligonucleotides, each oligonucleotide in the library having at least one predetermined position, a randomisation codon selected from a defined group of codons, the codons within said defined group coding for different amino acids, said method comprising the steps of: (a) Providing one or more double-stranded starter oligonucleotides, wherein the starter oligonucleotides have one or more blunt ends; (b) Providing a plurality of different double stranded randomisation oligonucleotides comprising: (i) a coding strand, the coding strand comprising a randomisation codon; and (ii) a substantially complementary non-coding strand, wherein each double stranded randomisation oligonucleotide comprises a nucleotide sequence coding for a restriction endonuclease recognition site capable of being recognised by a restriction endonuclease, the restriction endonuclease capable of cleaving the randomisation oligonucleotide upstream or downstream of the endonuclease recognition site at a predetermined cleavage site to create a blunt ended cut; (c) Ligating each double-stranded starter oligonucleotide to a double-stranded randomisation oligonucleotide to form ligated oligonucleotides; (d) Amplifying the ligated oligonucleotides; (e) Digesting the ligated oligonucleotide with the restriction endonuclease to form a plurality of randomised double-stranded oligonucleotides, each of which comprise, at one end, a randomisation codon; and, optionally, (f) Using the randomised double-stranded oligonucleotides as starter oligonucleotides and repeating method steps (a) to (e) and optional step (f) to produce a plurality of randomised double-stranded oligonucleotides, each comprising an additional randomisation codon; the kit comprises a plurality of different randomisation oligonucleotides comprising: (i) a coding strand, the coding strand comprising a randomisation codon; and (ii) a substantially complementary non-coding strand, wherein each double stranded randomisation oligonucleotide comprises a nucleotide sequence coding for a restriction endonuclease recognition site capable of being recognised by a restriction endonuclease, the restriction endonuclease capable of cleaving the randomisation oligonucleotide upstream or downstream of the endonuclease recognition site at a predetermined cleavage site to create a blunt ended cut.
 22. A kit according to claim 21, wherein the restriction endonuclease recognition site is:
(SEQ. ID. No. 1) 5′-GAGTCNNNNN^-3′
(SEQ. ID. No. 2) 3′-CTCAGNNNNN^-5′
where N=any nucleotide and ^=the restriction endonuclease cleavage site.
 23. A kit according to claim 21 comprising a restriction enzyme capable of cleaving the randomisation oligonucleotide at the predetermined cleavage site.
 24. A kit according to claim 23, wherein the restriction endonuclease is selected from SchI and MlyI.
 25. A kit according to claim 21 wherein the randomisation codons consist of MAX codons which represent the optimum codon usage of a predetermined organism of interest or a predetermined selection of said MAX codons.
 26. A kit according to claim 21, the coding strand of the double-stranded randomisation oligonucleotide comprising a 5′ end and a 3′ end, the 3′ end of the coding strand comprising a blocking group.
 27. A kit according to claim 26, wherein the blocking group is selected from an amino group, a phosphate group, a glycerol moiety, a thiol group and a polyethylene glycol moiety.
 28. A kit according to claim 21, wherein the non-coding strand comprises a 3′ end and a 5′ end, the 5′ end of the non-coding strand extending one or more nucleotides beyond the 3′ end of the complementary coding strand.
 29. A kit according to claim 21, wherein the coding strand comprises the sequence:
(i) 5′ XXX^(N)_(a)R(N)_(b)-B 3′ or
(ii) 5′-(N)_(b)R(N)_(a)^XXX-3′
where: XXX is the randomisation codon, N is any nucleotide, a is an integer of 0 to 10, preferably 5, b is an integer of 0 to 40, preferably 1 to 20, or preferably 1 to 10, ^ is the restriction endonuclease cleavage site, R′ is the reverse complement of the restriction endonuclease recognition site sequence, R is the restriction endonuclease recognition site sequence, B may or may not be present and when present may be selected from —OH, —NH₂, phosphate, a glyceryl moiety, a thiol group and a polyethylene glycol moiety.
 30. A kit according to claim 21 additionally comprising a predefined oligonucleotide comprising: (i) a coding strand, the coding strand comprising a predefined codon coding for a predefined amino acid; and (ii) a substantially complementary non-coding strand, wherein the predefined oligonucleotide comprises a nucleotide sequence coding for a restriction endonuclease recognition site capable of being recognised by a restriction endonuclease site, the restriction endonuclease capable of cleaving the predefined oligonucleotide upstream or downstream of the endonuclease recognition at a predetermined cleavage site to create a blunt ended cut.
 31. A kit according to claim 21 additionally comprising a completion oligonucleotide having a predefined sequence.
 32. (canceled)
 33. (canceled)